Current (self-reported) fuel type
The numbers of observations with each current fuel type:
##
## Smokeles Smoky Wood_and_or_Plant
## 17 87 8
Primary analysis
Investigate the association of epigenetic age acceleration with current (self-reported) fuel type in the LEX study participants, adjusting for known confounders and stove ventilation. The reference group for this analysis is smoky coal users. This is a categorical analysis with two sets of results: (i) a p-value from the likelihood ratio (LR) test comparing a confounder-only model with a model that also includes the exposure variables, and (ii) p-values for the contrast of each other fuel category (smokeless coal, or wood/plant) against smoky coal. FDR correction is applied separately to each of these sets. The main interest is in the fuel-specific contrasts, and less so in the results of the LR test.
Likelihood ratio (LR) test (mixed model)
Full model: \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * I(\text{Smokeles}) + \beta_2 * I(\text{Wood_and_or_Plant}) \\
& + \beta_3 * county + \beta_4 * BMI + \beta_5 * ses + \beta_6 * edu + \beta_7 * curStove + \epsilon
\end{aligned}
\] Nested model: \[
\begin{aligned}
Y = & \beta_0 \\
& + \beta_1 * county + \beta_2 * BMI + \beta_3 * ses + \beta_4 * edu + \beta_5 * curStove + \epsilon
\end{aligned}
\] \(H_0\): The full model and the nested model fit the data equally well. Thus, you should use the nested model.
\(H_A\): The full model fits the data significantly better than the nested model. Thus, you should use the full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.2368 0.6340
## Hannum EAA 0.6304 0.6340
## PhenoAge EAA 0.5142 0.6340
## Skin&Blood EAA 0.4887 0.6340
## GrimAge EAA 0.0279 0.2232
## DNAmTL 0.5250 0.6340
## IEAA 0.3694 0.6340
## EEAA 0.6340 0.6340
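The BH-adjusted column can be reproduced from the raw p-values with the Benjamini-Hochberg step-up procedure. The following is a minimal Python sketch, not the code used to produce this report; the function name `bh_adjust` is illustrative, and the inputs are the rounded raw p-values from the table above.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(p_values)
    # Visit the p-values from largest to smallest.
    order = sorted(range(n), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = n - position  # rank of this p-value in ascending order
        # Scale by n / rank, then enforce monotonicity with a running minimum.
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Rounded raw LR-test p-values copied from the table above.
raw = {
    "Horvath EAA": 0.2368,
    "Hannum EAA": 0.6304,
    "PhenoAge EAA": 0.5142,
    "Skin&Blood EAA": 0.4887,
    "GrimAge EAA": 0.0279,
    "DNAmTL": 0.5250,
    "IEAA": 0.3694,
    "EEAA": 0.6340,
}
adjusted = dict(zip(raw, bh_adjust(list(raw.values()))))
for name in raw:
    print(f"{name:<15} {raw[name]:.4f} {adjusted[name]:.4f}")
```

Applied to the eight p-values above, this gives 0.2232 for GrimAge EAA and 0.6340 for every other outcome, in agreement with the `p_vals_BHadj` column.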
Linear regression
In the following section, we performed linear regression with equation \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * I(\text{Smokeles}) + \beta_2 * I(\text{Wood_and_or_Plant}) \\
& + \beta_3 * county + \beta_4 * BMI + \beta_5 * ses + \beta_6 * edu + \beta_7 * curStove + \epsilon
\end{aligned}
\] where \(Y\) is one of the epigenetic age acceleration measures and the smoky fuel type is the reference category.
The estimates of \(\beta_0\), \(\beta_1\) and \(\beta_2\) for each \(Y\) are shown below. \(\beta_1\) and \(\beta_2\) can be interpreted as “the expected change in Y when switching from the smoky fuel type to the given fuel type, while holding other variables constant”.

Sensitivity analyses
Likelihood ratio (LR) test (single model)
Restrict the primary analysis to a single observation per subject, so that a mixed model is not needed. The rationale is that unbiased p-values for FDR testing are not easy to obtain from a mixed model. This can be remedied during FDR testing, but it is worth checking.
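Reducing the data to one observation per subject can be sketched as follows. This is a hedged Python illustration only: the report does not say which visit is retained, so keeping the earliest visit is an assumption, and the field names `subject_id` and `visit` as well as the records themselves are hypothetical.

```python
def first_visit_per_subject(records):
    """Keep one record per subject: the one with the smallest visit number.

    Assumption: the earliest visit is the one retained (not stated in the report).
    """
    kept = {}
    for rec in records:
        sid = rec["subject_id"]
        if sid not in kept or rec["visit"] < kept[sid]["visit"]:
            kept[sid] = rec
    return list(kept.values())


# Hypothetical records; subjects S01 and S02 each have two visits.
records = [
    {"subject_id": "S01", "visit": 1, "fuel": "Smoky"},
    {"subject_id": "S01", "visit": 2, "fuel": "Smoky"},
    {"subject_id": "S02", "visit": 2, "fuel": "Smokeles"},
    {"subject_id": "S02", "visit": 1, "fuel": "Smokeles"},
    {"subject_id": "S03", "visit": 1, "fuel": "Wood_and_or_Plant"},
]
single = first_visit_per_subject(records)
print(len(single), "observations retained")
```

With one row per subject, the observations are no longer repeated within subject, which is why an ordinary linear model can replace the mixed model here.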
Full model: \[Y = \beta_0 + \beta_1 * I(\text{Smokeles}) + \beta_2 * I(\text{Wood_and_or_Plant}) + \epsilon\]
Nested model: \[Y = \beta_0 + \epsilon\]
\(H_0\): The full model and the nested model fit the data equally well. Thus, you should use the nested model.
\(H_A\): The full model fits the data significantly better than the nested model. Thus, you should use the full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.2800 0.8200
## Hannum EAA 0.4890 0.8819
## PhenoAge EAA 0.8936 0.8936
## Skin&Blood EAA 0.5512 0.8819
## GrimAge EAA 0.1672 0.8200
## DNAmTL 0.8624 0.8936
## IEAA 0.3075 0.8200
## EEAA 0.6635 0.8847
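For the unadjusted comparison above (intercept-only model against a model with one indicator per non-reference fuel type), the LR statistic has a simple closed form under Gaussian errors: \(n \ln(RSS_{nested}/RSS_{full})\), referred to a chi-square distribution with 2 degrees of freedom. The sketch below is an illustration under those assumptions, not the routine used for this report; the function name and the outcome values are hypothetical.

```python
import math


def lr_test_three_groups(groups):
    """LR test of Y = b0 + e against a model with a separate mean per group.

    `groups` maps a category label to a list of outcome values.
    Returns (LR statistic, degrees of freedom, p-value). The closed-form
    chi-square tail used here holds only for df = 2, i.e. three groups.
    """
    if len(groups) != 3:
        raise ValueError("closed-form p-value requires exactly three groups")
    values = [y for ys in groups.values() for y in ys]
    n = len(values)
    grand_mean = sum(values) / n
    # Nested model: a single overall mean.
    rss_nested = sum((y - grand_mean) ** 2 for y in values)
    # Full model: one fitted mean per fuel type.
    rss_full = 0.0
    for ys in groups.values():
        group_mean = sum(ys) / len(ys)
        rss_full += sum((y - group_mean) ** 2 for y in ys)
    lr = n * math.log(rss_nested / rss_full)
    # Survival function of a chi-square with 2 df is exp(-x / 2).
    p_value = math.exp(-lr / 2.0)
    return lr, 2, p_value


# Hypothetical age-acceleration values by current fuel type.
toy = {
    "Smoky": [0.5, -1.2, 2.0, 0.3, -0.4],
    "Smokeles": [1.1, 0.2, -0.6, 0.9],
    "Wood_and_or_Plant": [-0.8, 0.4, 1.5],
}
lr_stat, df, p = lr_test_three_groups(toy)
print(f"LR = {lr_stat:.3f}, df = {df}, p = {p:.4f}")
```

The degrees of freedom equal the number of exposure coefficients dropped in the nested model, so the four-category and multi-prototype comparisons elsewhere in this report would need a general chi-square tail function rather than this two-parameter shortcut.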
Linear relation
Use a trend test to estimate a linear relation across use categories (1=wood, 2=smokeless coal, 3=smoky coal). Fit the equation: \[Y = \beta_0 + \beta_1 * fuel\_type + \epsilon\]
## coefficient std pval pval_BHadj
## AgeAccelerationResidual -1.03 0.74 0.17 0.17
## AgeAccelerationResidualHannum -0.70 0.64 0.28 0.37
## AgeAccelPheno -0.06 0.65 0.93 0.93
## DNAmAgeSkinBloodClockAdjAge -0.08 0.53 0.88 0.88
## AgeAccelGrim -0.11 0.47 0.81 0.82
## DNAmTLAdjAge -0.02 0.03 0.60 0.60
## IEAA -0.98 0.67 0.15 0.15
## EEAA -0.72 0.81 0.38 0.47
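The trend test treats the fuel categories as equally spaced scores and fits a straight line through them. The following Python sketch shows how the coefficient and its standard error arise from the scoring stated above (1=wood, 2=smokeless coal, 3=smoky coal); the outcome values and the names `FUEL_SCORE` and `simple_ols` are hypothetical, and the p-value step (a t distribution with n - 2 degrees of freedom) is omitted.

```python
import math

# Ordinal scores as defined for the trend test above.
FUEL_SCORE = {"Wood_and_or_Plant": 1, "Smokeles": 2, "Smoky": 3}


def simple_ols(x, y):
    """Slope, standard error, and t statistic for Y = b0 + b1 * x + e."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    # Residual variance uses n - 2 degrees of freedom.
    std_error = math.sqrt(rss / (n - 2) / sxx)
    return slope, std_error, slope / std_error


# Hypothetical (fuel type, outcome) pairs.
observations = [
    ("Wood_and_or_Plant", 1.0), ("Wood_and_or_Plant", 1.2),
    ("Smokeles", 2.1), ("Smokeles", 1.9),
    ("Smoky", 2.9), ("Smoky", 3.1),
]
x = [FUEL_SCORE[fuel] for fuel, _ in observations]
y = [value for _, value in observations]
slope, std_error, t_stat = simple_ols(x, y)
print(f"coefficient = {slope:.2f}, std = {std_error:.2f}, t = {t_stat:.2f}")
```

Because the scores are equally spaced, the slope is read as the expected change in the outcome per one-step move along the ordering from wood to smoky coal.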

Cumulative lifetime (self-reported) fuel type
The numbers of observations with each cumulative lifetime fuel type:
##
## Mix Smoky
## 82 37
Primary analysis
Likelihood ratio (LR) test (mixed model)
Full model: \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * I(\text{Mix}) \\
& + \beta_2 * county + \beta_3 * BMI + \beta_4 * ses + \beta_5 * edu + \beta_6 * curStove + \epsilon
\end{aligned}
\] Nested model: \[
\begin{aligned}
Y = & \beta_0 \\
& + \beta_1 * county + \beta_2 * BMI + \beta_3 * ses + \beta_4 * edu + \beta_5 * curStove + \epsilon
\end{aligned}
\] \(H_0\): The full model and the nested model fit the data equally well. Thus, you should use the nested model.
\(H_A\): The full model fits the data significantly better than the nested model. Thus, you should use the full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.3222 0.5415
## Hannum EAA 0.4061 0.5415
## PhenoAge EAA 0.6397 0.7311
## Skin&Blood EAA 0.9331 0.9331
## GrimAge EAA 0.0245 0.1960
## DNAmTL 0.3396 0.5415
## IEAA 0.0940 0.3760
## EEAA 0.2773 0.5415
Linear regression
In the following section, we performed linear regression with equation \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * I(\text{Mix}) \\
& + \beta_2 * county + \beta_3 * BMI + \beta_4 * ses + \beta_5 * edu + \beta_6 * curStove + \epsilon
\end{aligned}
\]
where \(Y\) is one of the epigenetic age acceleration measures and the smoky fuel type is the reference category.
The estimates of \(\beta_0\) and \(\beta_1\) for each \(Y\) are shown below. \(\beta_1\) can be interpreted as “the expected change in Y when switching from the smoky fuel type to the mix fuel type, while holding other variables constant”.

Sensitivity analyses
Likelihood ratio (LR) test (single model)
Full model: \[Y = \beta_0 + \beta_1 * I(\text{Mix}) + \epsilon\]
Nested model: \[Y = \beta_0 + \epsilon\]
\(H_0\): The full model and the nested model fit the data equally well. Thus, you should use the nested model.
\(H_A\): The full model fits the data significantly better than the nested model. Thus, you should use the full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.3532 0.8414
## Hannum EAA 0.7909 0.8414
## PhenoAge EAA 0.8253 0.8414
## Skin&Blood EAA 0.8414 0.8414
## GrimAge EAA 0.1805 0.7220
## DNAmTL 0.6405 0.8414
## IEAA 0.0759 0.6072
## EEAA 0.6484 0.8414
Linear relation
Use a trend test to estimate a linear relation across use categories (1=mix, 2=smoky coal). Fit the equation: \[Y = \beta_0 + \beta_1 * fuel\_type + \epsilon\]
## coefficient std pval pval_BHadj
## AgeAccelerationResidual -0.88 0.95 0.36 0.36
## AgeAccelerationResidualHannum 0.21 0.80 0.79 0.79
## AgeAccelPheno 0.18 0.81 0.83 0.83
## DNAmAgeSkinBloodClockAdjAge 0.14 0.69 0.84 0.84
## AgeAccelGrim 0.75 0.57 0.19 0.19
## DNAmTLAdjAge -0.02 0.04 0.64 0.64
## IEAA -1.52 0.86 0.08 0.16
## EEAA 0.46 1.02 0.65 0.65

Childhood (self-reported) fuel type
The numbers of observations with each childhood fuel type:
##
## Mix Smokeles Smoky Wood
## 53 5 47 11
Primary analysis
Likelihood ratio (LR) test (mixed model)
Full model: \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * I(\text{Wood}) + \beta_2 * I(\text{Smokeles}) + \beta_3 * I(\text{Mix}) \\
& + \beta_4 * county + \beta_5 * BMI + \beta_6 * ses + \beta_7 * edu + \beta_8 * curStove + \epsilon
\end{aligned}
\] Nested model: \[
\begin{aligned}
Y = & \beta_0 \\
& + \beta_1 * county + \beta_2 * BMI + \beta_3 * ses + \beta_4 * edu + \beta_5 * curStove + \epsilon
\end{aligned}
\] \(H_0\): The full model and the nested model fit the data equally well. Thus, you should use the nested model.
\(H_A\): The full model fits the data significantly better than the nested model. Thus, you should use the full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.0412 0.1099
## Hannum EAA 0.1426 0.1901
## PhenoAge EAA 0.2872 0.3282
## Skin&Blood EAA 0.1345 0.1901
## GrimAge EAA 0.0051 0.0408
## DNAmTL 0.4625 0.4625
## IEAA 0.0379 0.1099
## EEAA 0.1276 0.1901
Linear regression
In the following section, we performed linear regression with equation \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * I(\text{Wood}) + \beta_2 * I(\text{Smokeles}) + \beta_3 * I(\text{Mix}) \\
& + \beta_4 * county + \beta_5 * BMI + \beta_6 * ses + \beta_7 * edu + \beta_8 * curStove + \epsilon
\end{aligned}
\] where \(Y\) is one of the epigenetic age acceleration measures and the smoky fuel type is the reference category.
The estimates of \(\beta_0\), \(\beta_1\), \(\beta_2\), and \(\beta_3\) for each \(Y\) are shown below. \(\beta_1\), \(\beta_2\), and \(\beta_3\) can be interpreted as “the expected change in Y when switching from the smoky fuel type to the given fuel type, while holding other variables constant”.

Sensitivity analyses
Likelihood ratio (LR) test (single model)
Restrict the primary analysis to a single observation per subject, so that a mixed model is not needed. The rationale is that unbiased p-values for FDR testing are not easy to obtain from a mixed model. This can be remedied during FDR testing, but it is worth checking.
Full model: \[Y = \beta_0 + \beta_1 * I(\text{Wood}) + \beta_2 * I(\text{Smokeles}) + \beta_3 * I(\text{Mix}) + \epsilon\]
Nested model: \[Y = \beta_0 + \epsilon\]
\(H_0\): The full model and the nested model fit the data equally well. Thus, you should use the nested model.
\(H_A\): The full model fits the data significantly better than the nested model. Thus, you should use the full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.2833 0.6101
## Hannum EAA 0.3813 0.6101
## PhenoAge EAA 0.8336 0.8336
## Skin&Blood EAA 0.7398 0.8336
## GrimAge EAA 0.0146 0.1168
## DNAmTL 0.5919 0.7892
## IEAA 0.1220 0.4880
## EEAA 0.3340 0.6101
Linear relation
Use a trend test to estimate a linear relation across use categories (1=wood, 2=smokeless coal, 3=mix, 4=smoky coal). Fit the equation: \[Y = \beta_0 + \beta_1 * fuel\_type + \epsilon\]
## coefficient std pval pval_BHadj
## AgeAccelerationResidual -0.70 0.50 0.16 0.16
## AgeAccelerationResidualHannum -0.50 0.42 0.24 0.30
## AgeAccelPheno -0.13 0.43 0.77 0.93
## DNAmAgeSkinBloodClockAdjAge 0.02 0.36 0.95 0.95
## AgeAccelGrim 0.27 0.30 0.37 0.37
## DNAmTLAdjAge 0.01 0.02 0.62 0.99
## IEAA -0.87 0.44 0.05 0.06
## EEAA -0.52 0.53 0.33 0.39

Clusters based on model-based exposure estimates at or shortly before the visit (clusCUR6)
The file “LEX_clusCUR6.csv” has information on current pollutant exposures, obtained for the 2 years preceding the visit. To reduce multicollinearity between exposures, exposure prototypes were derived using hierarchical cluster analysis followed by principal components analysis. These estimates are available for 6 different prototypes (cluster variables) for a total of 161 subjects and 211 visits. The prototypes are labelled as:
CUR6_BC_PAH6 – Black carbon (BC) and 6 PAHs
CUR6_PAH31 – a large cluster of 31 PAHs
CUR6_NkF – NkF only
CUR6_PM_RET – Particulate matter (PM) and retene
CUR6_NO2 – NO2 only
CUR6_SO2 – SO2 only
Summary of the exposure estimates:
## CUR6_BC_PAH6 CUR6_PAH31 CUR6_NkF CUR6_PM_RET
## Min. :-1.6472 Min. :-1.9531 Min. :-3.0963 Min. :-1.72458
## 1st Qu.:-0.5226 1st Qu.:-0.3924 1st Qu.:-0.5918 1st Qu.:-0.54260
## Median : 0.7938 Median : 0.3928 Median :-0.3663 Median :-0.30489
## Mean : 0.2134 Mean : 0.1962 Mean :-0.1059 Mean :-0.01324
## 3rd Qu.: 0.8098 3rd Qu.: 0.6301 3rd Qu.: 0.7448 3rd Qu.: 0.36146
## Max. : 1.6827 Max. : 2.5950 Max. : 2.2506 Max. : 2.60492
## NA's :13 NA's :13 NA's :13 NA's :13
## CUR6_NO2 CUR6_SO2
## Min. :-2.58032 Min. :-3.4207
## 1st Qu.:-0.44259 1st Qu.:-0.8550
## Median : 0.05849 Median :-0.2976
## Mean : 0.18535 Mean :-0.1312
## 3rd Qu.: 0.81282 3rd Qu.: 0.2025
## Max. : 2.27519 Max. : 1.6387
## NA's :13 NA's :13
Primary analysis
Likelihood ratio (LR) test (mixed model)
Full model: \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * \text{BC_PAH6} + \beta_2 * \text{PAH31} + \beta_3 * \text{NkF} + \beta_4 * \text{PM_RET} + \beta_5 * \text{NO2} + \beta_6 * \text{SO2}\\
& + \beta_7 * county + \beta_8 * BMI + \beta_9 * ses + \beta_{10} * edu + \beta_{11} * curStove + \epsilon
\end{aligned}
\] Nested model: \[
\begin{aligned}
Y = & \beta_0 \\
& + \beta_1 * county + \beta_2 * BMI + \beta_3 * ses + \beta_4 * edu + \beta_5 * curStove + \epsilon
\end{aligned}
\] \(H_0\): The full model and the nested model fit the data equally well. Thus, you should use the nested model.
\(H_A\): The full model fits the data significantly better than the nested model. Thus, you should use the full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.1900 0.2533
## Hannum EAA 0.0239 0.0504
## PhenoAge EAA 0.0210 0.0504
## Skin&Blood EAA 0.1401 0.2242
## GrimAge EAA 0.0085 0.0504
## DNAmTL 0.2939 0.3359
## IEAA 0.4320 0.4320
## EEAA 0.0252 0.0504
Linear regression
In the following section, we performed linear regression with equation \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * \text{BC_PAH6} + \beta_2 * \text{PAH31} + \beta_3 * \text{NkF} + \beta_4 * \text{PM_RET} + \beta_5 * \text{NO2} + \beta_6 * \text{SO2}\\
& + \beta_7 * county + \beta_8 * BMI + \beta_9 * ses + \beta_{10} * edu + \beta_{11} * curStove + \epsilon
\end{aligned}
\] where \(Y\) is one of the epigenetic age accelerations.
The estimates of \(\beta_1\), \(\beta_2\), \(\beta_3\), \(\beta_4\), \(\beta_5\), and \(\beta_6\) for each \(Y\) are shown below. Each of these coefficients can be interpreted as “the expected change in Y per one-unit increase in the given exposure prototype, while holding other variables constant”.

Sensitivity analyses
Likelihood ratio (LR) test (single model)
Full model: \[Y = \beta_0 + \beta_1 * \text{BC_PAH6} + \beta_2 * \text{PAH31} + \beta_3 * \text{NkF} + \beta_4 * \text{PM_RET} + \beta_5 * \text{NO2} + \beta_6 * \text{SO2} + \epsilon\]
Nested model: \[Y = \beta_0 + \epsilon\]
\(H_0\): The full model and the nested model fit the data equally well. Thus, you should use the nested model.
\(H_A\): The full model fits the data significantly better than the nested model. Thus, you should use the full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.1840 0.2968
## Hannum EAA 0.2914 0.3775
## PhenoAge EAA 0.0241 0.0755
## Skin&Blood EAA 0.0283 0.0755
## GrimAge EAA 0.0263 0.0755
## DNAmTL 0.4823 0.4823
## IEAA 0.3303 0.3775
## EEAA 0.1855 0.2968
Likelihood ratio (LR) test (single model) with subjects using only smoky or smokeless coal
Full model: \[Y = \beta_0 + \beta_1 * \text{BC_PAH6} + \beta_2 * \text{PAH31} + \beta_3 * \text{NkF} + \beta_4 * \text{PM_RET} + \beta_5 * \text{NO2} + \beta_6 * \text{SO2} + \epsilon\]
Nested model: \[Y = \beta_0 + \epsilon\]
\(H_0\): The full model and the nested model fit the data equally well. Thus, you should use the nested model.
\(H_A\): The full model fits the data significantly better than the nested model. Thus, you should use the full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.4226 0.4830
## Hannum EAA 0.1558 0.2707
## PhenoAge EAA 0.0209 0.1672
## Skin&Blood EAA 0.1692 0.2707
## GrimAge EAA 0.0806 0.2149
## DNAmTL 0.2510 0.3347
## IEAA 0.6041 0.6041
## EEAA 0.0626 0.2149
Likelihood ratio (LR) test (single model) with subjects only using smoky coal
Full model: \[Y = \beta_0 + \beta_1 * \text{BC_PAH6} + \beta_2 * \text{PAH31} + \beta_3 * \text{NkF} + \beta_4 * \text{PM_RET} + \beta_5 * \text{NO2} + \beta_6 * \text{SO2} + \epsilon\]
Nested model: \[Y = \beta_0 + \epsilon\]
\(H_0\): The full model and the nested model fit the data equally well. Thus, you should use the nested model.
\(H_A\): The full model fits the data significantly better than the nested model. Thus, you should use the full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.0700 0.0800
## Hannum EAA 0.0037 0.0148
## PhenoAge EAA 0.0093 0.0248
## Skin&Blood EAA 0.0426 0.0800
## GrimAge EAA 0.0651 0.0800
## DNAmTL 0.2166 0.2166
## IEAA 0.0509 0.0800
## EEAA 0.0019 0.0148
Clusters based on model-based exposure estimates accrued before age 18 (clusCHLD5)
The file “LEX_clusCHLD5.csv” has information on estimated pollutant exposures during early childhood. Estimates are available for 5 different prototypes (cluster variables) for a total of 161 subjects and 211 visits. The prototypes are labelled as:
CHLD5_X7 – a cluster of 7 air pollutants
CHLD5_X33 – a large cluster of 33 air pollutants
CHLD5_NkF – NkF only
CHLD5_NO2 – NO2 only
CHLD5_SO2 – SO2 only
Summary of the exposure estimates:
## CHLD5_X7 CHLD5_X33 CHLD5_NkF CHLD5_NO2
## Min. :-2.00581 Min. :-2.0274 Min. :-3.7754 Min. :-2.0155
## 1st Qu.:-0.53325 1st Qu.:-0.5323 1st Qu.:-0.7656 1st Qu.:-0.5330
## Median : 0.10720 Median : 0.1218 Median :-0.2258 Median : 0.2416
## Mean :-0.01945 Mean : 0.1796 Mean :-0.1098 Mean : 0.2268
## 3rd Qu.: 0.51881 3rd Qu.: 1.1052 3rd Qu.: 0.7312 3rd Qu.: 0.8033
## Max. : 1.77388 Max. : 1.6650 Max. : 1.8887 Max. : 3.5648
## NA's :13 NA's :13 NA's :13 NA's :13
## CHLD5_SO2
## Min. :-1.38635
## 1st Qu.:-0.90192
## Median : 0.33208
## Mean : 0.01749
## 3rd Qu.: 0.44744
## Max. : 1.73751
## NA's :13
Primary analysis
Likelihood ratio (LR) test (mixed model)
Full model: \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * \text{X7} + \beta_2 * \text{X33} + \beta_3 * \text{NkF} + \beta_4 * \text{NO2} + \beta_5 * \text{SO2}\\
& + \beta_6 * county + \beta_7 * BMI + \beta_8 * ses + \beta_{9} * edu + \beta_{10} * curStove + \epsilon
\end{aligned}
\] Nested model: \[
\begin{aligned}
Y = & \beta_0 \\
& + \beta_1 * county + \beta_2 * BMI + \beta_3 * ses + \beta_4 * edu + \beta_5 * curStove + \epsilon
\end{aligned}
\] \(H_0\): The full model and the nested model fit the data equally well. Thus, you should use the nested model.
\(H_A\): The full model fits the data significantly better than the nested model. Thus, you should use the full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.4899 0.5599
## Hannum EAA 0.1305 0.2088
## PhenoAge EAA 0.0576 0.1782
## Skin&Blood EAA 0.0716 0.1782
## GrimAge EAA 0.0120 0.0960
## DNAmTL 0.5692 0.5692
## IEAA 0.4260 0.5599
## EEAA 0.0891 0.1782
Linear regression
In the following section, we performed linear regression with equation \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * \text{X7} + \beta_2 * \text{X33} + \beta_3 * \text{NkF} + \beta_4 * \text{NO2} + \beta_5 * \text{SO2}\\
& + \beta_6 * county + \beta_7 * BMI + \beta_8 * ses + \beta_{9} * edu + \beta_{10} * curStove + \epsilon
\end{aligned}
\] where \(Y\) is one of the epigenetic age accelerations.
The estimates of \(\beta_1\), \(\beta_2\), \(\beta_3\), \(\beta_4\), and \(\beta_5\) for each \(Y\) are shown below. Each of these coefficients can be interpreted as “the expected change in Y per one-unit increase in the given exposure prototype, while holding other variables constant”.

Sensitivity analyses
Likelihood ratio (LR) test (single model)
Full model: \[Y = \beta_0 + \beta_1 * \text{X7} + \beta_2 * \text{X33} + \beta_3 * \text{NkF} + \beta_4 * \text{NO2} + \beta_5 * \text{SO2} + \epsilon\]
Nested model: \[Y = \beta_0 + \epsilon\]
\(H_0\): The full model and the nested model fit the data equally well. Thus, you should use the nested model.
\(H_A\): The full model fits the data significantly better than the nested model. Thus, you should use the full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.8864 0.8864
## Hannum EAA 0.2901 0.5840
## PhenoAge EAA 0.1416 0.5664
## Skin&Blood EAA 0.3650 0.5840
## GrimAge EAA 0.0208 0.1664
## DNAmTL 0.5466 0.7288
## IEAA 0.6847 0.7825
## EEAA 0.3074 0.5840
Likelihood ratio (LR) test (single model) with subjects using only smoky or smokeless coal
Full model: \[Y = \beta_0 + \beta_1 * \text{X7} + \beta_2 * \text{X33} + \beta_3 * \text{NkF} + \beta_4 * \text{NO2} + \beta_5 * \text{SO2} + \epsilon\]
Nested model: \[Y = \beta_0 + \epsilon\]
\(H_0\): The full model and the nested model fit the data equally well. Thus, you should use the nested model.
\(H_A\): The full model fits the data significantly better than the nested model. Thus, you should use the full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.9805 0.9805
## Hannum EAA 0.3840 0.7341
## PhenoAge EAA 0.0700 0.2800
## Skin&Blood EAA 0.1867 0.4979
## GrimAge EAA 0.0634 0.2800
## DNAmTL 0.5506 0.7341
## IEAA 0.7515 0.8589
## EEAA 0.4588 0.7341
Likelihood ratio (LR) test (single model) with subjects only using smoky coal
Full model: \[Y = \beta_0 + \beta_1 * \text{X7} + \beta_2 * \text{X33} + \beta_3 * \text{NkF} + \beta_4 * \text{NO2} + \beta_5 * \text{SO2} + \epsilon\]
Nested model: \[Y = \beta_0 + \epsilon\]
\(H_0\): The full model and the nested model fit the data equally well. Thus, you should use the nested model.
\(H_A\): The full model fits the data significantly better than the nested model. Thus, you should use the full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.9268 0.9268
## Hannum EAA 0.3979 0.6366
## PhenoAge EAA 0.0655 0.3264
## Skin&Blood EAA 0.2424 0.6366
## GrimAge EAA 0.0816 0.3264
## DNAmTL 0.3621 0.6366
## IEAA 0.7955 0.9091
## EEAA 0.4873 0.6497
Clusters based on model-based lifetime exposure estimates (clusCUM6)
The file “LEX_clusCUM6.csv” has information on estimated cumulative pollutant exposures over the life course. Estimates are available for 6 different prototypes (cluster variables) for a total of 161 subjects and 211 visits. The prototypes are labelled as:
CUM6_BC_NO2_PM – a cluster of BC, NO2, and PM
CUM6_PAH36 – a large cluster of 36 PAHs
CUM6_DlP – DlP only
CUM6_NkF – NkF only
CUM6_RET – retene only
CUM6_SO2 – SO2 only
Summary of the exposure estimates:
## CUM6_BC_NO2_PM CUM6_PAH36 CUM6_DlP CUM6_NkF
## Min. :-2.1989 Min. :-2.0019 Min. :-2.4744 Min. :-2.34566
## 1st Qu.:-0.5606 1st Qu.:-0.5902 1st Qu.:-1.0232 1st Qu.:-0.84297
## Median : 0.2497 Median : 0.2500 Median :-0.5064 Median :-0.21091
## Mean : 0.1138 Mean : 0.2128 Mean :-0.2015 Mean :-0.04154
## 3rd Qu.: 0.8546 3rd Qu.: 1.1584 3rd Qu.: 0.7657 3rd Qu.: 0.43564
## Max. : 2.6510 Max. : 1.9951 Max. : 2.1588 Max. : 2.54737
## NA's :13 NA's :13 NA's :13 NA's :13
## CUM6_RET CUM6_SO2
## Min. :-2.44171 Min. :-1.75440
## 1st Qu.:-0.66614 1st Qu.:-0.68589
## Median :-0.20905 Median : 0.09033
## Mean :-0.09119 Mean :-0.06758
## 3rd Qu.: 0.40560 3rd Qu.: 0.35109
## Max. : 2.67607 Max. : 2.10707
## NA's :13 NA's :13
Primary analysis
Likelihood ratio (LR) test (mixed model)
Full model: \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * \text{BC_NO2_PM} + \beta_2 * \text{PAH36} + \beta_3 * \text{DlP} + \beta_4 * \text{NkF} + \beta_5 * \text{RET} + \beta_6 * \text{SO2}\\
& + \beta_7 * county + \beta_8 * BMI + \beta_9 * ses + \beta_{10} * edu + \beta_{11} * curStove + \epsilon
\end{aligned}
\] Nested model: \[
\begin{aligned}
Y = & \beta_0 \\
& + \beta_1 * county + \beta_2 * BMI + \beta_3 * ses + \beta_4 * edu + \beta_5 * curStove + \epsilon
\end{aligned}
\] \(H_0\): The full model and the nested model fit the data equally well. Thus, you should use the nested model.
\(H_A\): The full model fits the data significantly better than the nested model. Thus, you should use the full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.4844 0.4844
## Hannum EAA 0.3926 0.4844
## PhenoAge EAA 0.0933 0.3732
## Skin&Blood EAA 0.2405 0.3885
## GrimAge EAA 0.0011 0.0088
## DNAmTL 0.2155 0.3885
## IEAA 0.4703 0.4844
## EEAA 0.2428 0.3885
Linear regression
In the following section, we performed linear regression with equation \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * \text{BC_NO2_PM} + \beta_2 * \text{PAH36} + \beta_3 * \text{DlP} + \beta_4 * \text{NkF} + \beta_5 * \text{RET} + \beta_6 * \text{SO2}\\
& + \beta_7 * county + \beta_8 * BMI + \beta_9 * ses + \beta_{10} * edu + \beta_{11} * curStove + \epsilon
\end{aligned}
\] where \(Y\) is one of the epigenetic age accelerations.
The estimates of \(\beta_1\), \(\beta_2\), \(\beta_3\), \(\beta_4\), \(\beta_5\), and \(\beta_6\) for each \(Y\) are shown below. Each of these coefficients can be interpreted as “the expected change in Y per one-unit increase in the given exposure prototype, while holding other variables constant”.

Sensitivity analyses
Likelihood ratio (LR) test (single model)
Full model: \[Y = \beta_0 + \beta_1 * \text{BC_NO2_PM} + \beta_2 * \text{PAH36} + \beta_3 * \text{DlP} + \beta_4 * \text{NkF} + \beta_5 * \text{RET} + \beta_6 * \text{SO2} + \epsilon\]
Nested model: \[Y = \beta_0 + \epsilon\]
\(H_0\): The full model and the nested model fit the data equally well. Thus, you should use the nested model.
\(H_A\): The full model fits the data significantly better than the nested model. Thus, you should use the full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.5551 0.7401
## Hannum EAA 0.8488 0.8488
## PhenoAge EAA 0.1559 0.4157
## Skin&Blood EAA 0.2862 0.5202
## GrimAge EAA 0.0170 0.1360
## DNAmTL 0.1043 0.4157
## IEAA 0.3251 0.5202
## EEAA 0.7581 0.8488
Likelihood ratio (LR) test (single model) with subjects using only smoky or smokeless coal
Full model: \[Y = \beta_0 + \beta_1 * \text{BC_NO2_PM} + \beta_2 * \text{PAH36} + \beta_3 * \text{DlP} + \beta_4 * \text{NkF} + \beta_5 * \text{RET} + \beta_6 * \text{SO2} + \epsilon\]
Nested model: \[Y = \beta_0 + \epsilon\]
\(H_0\): The full model and the nested model fit the data equally well. Thus, you should use the nested model.
\(H_A\): The full model fits the data significantly better than the nested model. Thus, you should use the full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.6381 0.6381
## Hannum EAA 0.5536 0.6342
## PhenoAge EAA 0.0248 0.1984
## Skin&Blood EAA 0.1313 0.2626
## GrimAge EAA 0.1039 0.2626
## DNAmTL 0.0790 0.2626
## IEAA 0.4141 0.6342
## EEAA 0.5549 0.6342
Likelihood ratio (LR) test (single model) with subjects only using smoky coal
Full model: \[Y = \beta_0 + \beta_1 * \text{BC_NO2_PM} + \beta_2 * \text{PAH36} + \beta_3 * \text{DlP} + \beta_4 * \text{NkF} + \beta_5 * \text{RET} + \beta_6 * \text{SO2} + \epsilon\]
Nested model: \[Y = \beta_0 + \epsilon\]
\(H_0\): The full model and the nested model fit the data equally well. Thus, you should use the nested model.
\(H_A\): The full model fits the data significantly better than the nested model. Thus, you should use the full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.6380 0.6380
## Hannum EAA 0.3701 0.4914
## PhenoAge EAA 0.0243 0.1944
## Skin&Blood EAA 0.0878 0.2054
## GrimAge EAA 0.1027 0.2054
## DNAmTL 0.0826 0.2054
## IEAA 0.4300 0.4914
## EEAA 0.3562 0.4914
Clusters based on pollutant measurements (clusMEAS6)
Clusters based on urinary biomarkers (clusURI5)
The file “LEX_clusURI5.csv” has information on measured urinary biomarkers obtained during each visit. Estimates are available for 5 different prototypes (cluster variables) for a total of 163 subjects and 186 visits. The prototypes are labelled as:
URI5_NAP_1M_2M – a cluster of Naphthalene, 1Methylnaphthalene, and 2Methylnaphthalene
URI5_ACE – Acenaphthene only
URI5_FLU_PHE – Fluoranthene and Phenanthrene_anth
URI5_PYR – Pyrene only
URI5_CHR – Baa_Chrysene only
Summary of the exposure estimates:
## URI5_NAP_1M_2M URI5_ACE URI5_FLU_PHE URI5_PYR
## Min. :-2.12630 Min. :-3.01075 Min. :-1.97535 Min. :-2.27382
## 1st Qu.:-0.65236 1st Qu.:-0.60987 1st Qu.:-0.73402 1st Qu.:-0.25285
## Median : 0.07178 Median :-0.09045 Median : 0.03390 Median : 0.06614
## Mean : 0.01441 Mean :-0.05696 Mean :-0.04617 Mean :-0.01833
## 3rd Qu.: 0.55591 3rd Qu.: 0.68816 3rd Qu.: 0.52169 3rd Qu.: 0.49727
## Max. : 2.69501 Max. : 1.90667 Max. : 2.43408 Max. : 2.09455
## NA's :25 NA's :25 NA's :25 NA's :25
## URI5_CHR
## Min. :-3.86540
## 1st Qu.:-0.48920
## Median :-0.03949
## Mean :-0.01131
## 3rd Qu.: 0.47587
## Max. : 2.39391
## NA's :25
Primary analysis
Likelihood ratio (LR) test (mixed model)
Full model: \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * \text{NAP_1M_2M} + \beta_2 * \text{ACE} + \beta_3 * \text{FLU_PHE} + \beta_4 * \text{PYR} + \beta_5 * \text{CHR}\\
& + \beta_6 * county + \beta_7 * BMI + \beta_8 * ses + \beta_{9} * edu + \beta_{10} * curStove + \epsilon
\end{aligned}
\] Nested model: \[
\begin{aligned}
Y = & \beta_0 \\
& + \beta_1 * county + \beta_2 * BMI + \beta_3 * ses + \beta_4 * edu + \beta_5 * curStove + \epsilon
\end{aligned}
\] \(H_0\): The full model and the nested model fit the data equally well. Thus, you should use the nested model.
\(H_A\): The full model fits the data significantly better than the nested model. Thus, you should use the full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.4408 0.6169
## Hannum EAA 0.4716 0.6169
## PhenoAge EAA 0.0226 0.1808
## Skin&Blood EAA 0.8595 0.8595
## GrimAge EAA 0.0945 0.2520
## DNAmTL 0.0815 0.2520
## IEAA 0.5398 0.6169
## EEAA 0.5041 0.6169
Linear regression
In the following section, we performed linear regression with equation \[
\begin{aligned}
Y = & \beta_0 + \beta_1 * \text{NAP_1M_2M} + \beta_2 * \text{ACE} + \beta_3 * \text{FLU_PHE} + \beta_4 * \text{PYR} + \beta_5 * \text{CHR}\\
& + \beta_6 * county + \beta_7 * BMI + \beta_8 * ses + \beta_{9} * edu + \beta_{10} * curStove + \epsilon
\end{aligned}
\] where \(Y\) is one of the epigenetic age accelerations.
The estimates of \(\beta_1\), \(\beta_2\), \(\beta_3\), \(\beta_4\), and \(\beta_5\) for each \(Y\) are shown below. Each of these coefficients can be interpreted as “the expected change in Y per one-unit increase in the given exposure prototype, while holding other variables constant”.

Sensitivity analyses
Likelihood ratio (LR) test (single model)
Full model: \[Y = \beta_0 + \beta_1 * \text{NAP_1M_2M} + \beta_2 * \text{ACE} + \beta_3 * \text{FLU_PHE} + \beta_4 * \text{PYR} + \beta_5 * \text{CHR} + \epsilon\]
Nested model: \[Y = \beta_0 + \epsilon\]
\(H_0\): The full model and the nested model fit the data equally well. Thus, you should use the nested model.
\(H_A\): The full model fits the data significantly better than the nested model. Thus, you should use the full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.7166 0.9146
## Hannum EAA 0.8605 0.9146
## PhenoAge EAA 0.0779 0.5412
## Skin&Blood EAA 0.5460 0.9146
## GrimAge EAA 0.2178 0.5808
## DNAmTL 0.1353 0.5412
## IEAA 0.7881 0.9146
## EEAA 0.9146 0.9146
Likelihood ratio (LR) test (single model) with subjects using only smoky or smokeless coal
Full model: \[Y = \beta_0 + \beta_1 * \text{NAP_1M_2M} + \beta_2 * \text{ACE} + \beta_3 * \text{FLU_PHE} + \beta_4 * \text{PYR} + \beta_5 * \text{CHR} + \epsilon\]
Nested model: \[Y = \beta_0 + \epsilon\]
\(H_0\): The full model and the nested model fit the data equally well. Thus, you should use the nested model.
\(H_A\): The full model fits the data significantly better than the nested model. Thus, you should use the full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.5630 0.9535
## Hannum EAA 0.7871 0.9535
## PhenoAge EAA 0.1480 0.9535
## Skin&Blood EAA 0.9124 0.9535
## GrimAge EAA 0.8240 0.9535
## DNAmTL 0.5162 0.9535
## IEAA 0.7267 0.9535
## EEAA 0.9535 0.9535
Likelihood ratio (LR) test (single model) with subjects only using smoky coal
Full model: \[Y = \beta_0 + \beta_1 * \text{NAP_1M_2M} + \beta_2 * \text{ACE} + \beta_3 * \text{FLU_PHE} + \beta_4 * \text{PYR} + \beta_5 * \text{CHR} + \epsilon\]
Nested model: \[Y = \beta_0 + \epsilon\]
\(H_0\): The full model and the nested model fit the data equally well. Thus, you should use the nested model.
\(H_A\): The full model fits the data significantly better than the nested model. Thus, you should use the full model.
P-values results:
## p_vals p_vals_BHadj
## Horvath EAA 0.7266 0.9357
## Hannum EAA 0.8797 0.9357
## PhenoAge EAA 0.1130 0.9040
## Skin&Blood EAA 0.6715 0.9357
## GrimAge EAA 0.8736 0.9357
## DNAmTL 0.6495 0.9357
## IEAA 0.8091 0.9357
## EEAA 0.9357 0.9357
Ambient Exposure
Linear regression (simple model)
In the following section, we performed linear regression with equation \[Y = \beta_0 + \beta_1 *X + \epsilon\] where \(Y\) is one of the epigenetic age accelerations, and \(X\) is one of the ambient exposure measurements.
The estimates of \(\beta_1\) for each pair of \(Y\) and \(X\) are shown below. Each estimate can be interpreted as the change in the mean of \(Y\) for a one-unit increase in \(X\); because the simple model contains no other covariates, these estimates are unadjusted.


## X: Ambient Exposure Measurements:
## bap pm25
## Horvath EAA -0.007 (0.0048) 0.0018 (0.0024)
## Hannum EAA -0.0023 (0.0042) 0.0001 (0.002 )
## PhenoAge EAA -0.002 (0.0045) -0.001 (0.0022)
## Skin&Blood EAA 0.0029 (0.0034) -0.0005 (0.0017)
## GrimAge EAA 0.0017 (0.0031) 0.0013 (0.0015)
## DNAmTL 0 (0.0002) -0.0001 (0.0001)
## IEAA -0.0098* (0.0042) 0.0015 (0.0022)
## EEAA -0.0012 (0.0054) 0.0003 (0.0026)
## [P<0.001: ***; P<0.01: **; P<0.05: *]
## ANY BPE BaA
## Horvath EAA 0.0009 ** (0.0003) -0.0072 (0.0046) -0.0042 (0.0029)
## Hannum EAA 0.0004 (0.0003) -0.0032 (0.004 ) -0.0015 (0.0026)
## PhenoAge EAA 0.0006 * (0.0003) -0.0037 (0.0043) -0.0004 (0.0027)
## Skin&Blood EAA 0.0004 (0.0002) 0.0026 (0.0032) 0.0013 (0.0021)
## GrimAge EAA 0.0007 ***(0.0002) 0.0013 (0.0029) 0.0014 (0.0019)
## DNAmTL 0 (0 ) 0 (0.0002) 0 (0.0001)
## IEAA 0.0006 * (0.0003) -0.0103* (0.004 ) -0.0055* (0.0026)
## EEAA 0.0006 (0.0003) -0.0023 (0.0051) -0.0007 (0.0032)
## BbF BkF CHR
## Horvath EAA -0.0034 (0.003 ) -0.0197 (0.0133) -0.0034 (0.0032)
## Hannum EAA -0.0017 (0.0026) -0.0082 (0.0117) -0.0017 (0.0028)
## PhenoAge EAA -0.0009 (0.0028) -0.0063 (0.0124) -0.0001 (0.003 )
## Skin&Blood EAA 0.0014 (0.0021) 0.0063 (0.0094) 0.0013 (0.0022)
## GrimAge EAA 0.0013 (0.0019) 0.0058 (0.0085) 0.0019 (0.002 )
## DNAmTL 0 (0.0001) 0.0001 (0.0005) 0 (0.0001)
## IEAA -0.0049 (0.0026) -0.0269* (0.0117) -0.0048 (0.0028)
## EEAA -0.0009 (0.0033) -0.0047 (0.0148) -0.0008 (0.0035)
## DBA FLT FLU
## Horvath EAA -0.0193 (0.0117) -0.0061* (0.0031) 0.0008 (0.0008)
## Hannum EAA -0.0103 (0.0103) -0.0025 (0.0027) -0.0003 (0.0007)
## PhenoAge EAA -0.0046 (0.011 ) -0.0008 (0.0029) 0.0007 (0.0007)
## Skin&Blood EAA 0.0043 (0.0083) -0.0002 (0.0022) 0.0005 (0.0005)
## GrimAge EAA 0.0044 (0.0075) 0.0012 (0.002 ) 0.0009 (0.0005)
## DNAmTL -0.0001 (0.0005) 0.0001 (0.0001) 0 (0 )
## IEAA -0.0267* (0.0102) -0.0064* (0.0027) 0.0006 (0.0007)
## EEAA -0.0075 (0.0131) -0.0024 (0.0034) -0.0001 (0.0009)
## IPY NAP PHE
## Horvath EAA -0.0112 (0.0082) 0.0002 ** (0.0001) 0.0006 (0.0005)
## Hannum EAA -0.0031 (0.0072) 0.0001 (0.0001) -0.0001 (0.0004)
## PhenoAge EAA -0.0054 (0.0077) 0.0001 * (0.0001) 0.0005 (0.0005)
## Skin&Blood EAA 0.0055 (0.0058) 0.0001 (0 ) 0.0003 (0.0004)
## GrimAge EAA 0.0041 (0.0053) 0.0001 ***(0 ) 0.0006 * (0.0003)
## DNAmTL 0.0001 (0.0003) 0 (0 ) 0 (0 )
## IEAA -0.0173* (0.0072) 0.0001 * (0.0001) 0.0005 (0.0004)
## EEAA -0.0021 (0.0092) 0.0001 (0.0001) 0 (0.0006)
## PYR
## Horvath EAA -0.0054 (0.003 )
## Hannum EAA -0.002 (0.0026)
## PhenoAge EAA -0.0006 (0.0028)
## Skin&Blood EAA 0.0002 (0.0021)
## GrimAge EAA 0.0012 (0.0019)
## DNAmTL 0.0001 (0.0001)
## IEAA -0.006 * (0.0026)
## EEAA -0.0018 (0.0034)
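To make the layout of each table entry concrete, the sketch below fits the simple model \(Y = \beta_0 + \beta_1 X + \epsilon\) by ordinary least squares and reports the slope together with its standard error. It assumes that the values in parentheses in the table above are standard errors of \(\beta_1\), which the report does not state explicitly. The function name `simple_ols` and the data are hypothetical and unrelated to the study measurements.

```python
import math

def simple_ols(x, y):
    """Return (intercept, slope, standard error of the slope) for y ~ x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    return intercept, slope, se_slope

# Made-up illustration data: an exposure X and an age-acceleration outcome Y.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1, se_b1 = simple_ols(x, y)
# Printed as "estimate (standard error)", mirroring the table entries.
print(f"beta_1 = {b1:.4f} ({se_b1:.4f})")
```

Dividing the slope by its standard error gives the t statistic on which the significance stars in such a table are typically based.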
Urinary Measurements
Linear regression (simple model)
In the following section, we performed linear regression with equation \[Y = \beta_0 + \beta_1 *X + \epsilon\] where \(Y\) is one of the epigenetic age accelerations, and \(X\) is one of the urinary measurements.
The estimates of \(\beta_1\) for each pair of \(Y\) and \(X\) are shown below. Each estimate can be interpreted as the change in the mean of \(Y\) for a one-unit increase in \(X\); because the simple model contains no other covariates, these estimates are unadjusted.

## X: Urinary Measurements:
## Benzanthracene_Chrysene Naphthalene
## Horvath EAA -0.0361 (0.1332) -0.0015* (0.0006)
## Hannum EAA 0.0486 (0.1146) -0.0012* (0.0005)
## PhenoAge EAA -0.0481 (0.122 ) -0.0011 (0.0006)
## Skin&Blood EAA 0.084 (0.0959) -0.0015***(0.0004)
## GrimAge EAA 0.1136 (0.0845) 0 (0.0004)
## DNAmTL -0.0043 (0.005 ) 0 (0 )
## IEAA -0.1071 (0.1221) -0.001 (0.0006)
## EEAA 0.0996 (0.1458) -0.0012 (0.0007)
## [P<0.001: ***; P<0.01: **; P<0.05: *]
## 2.Methylnaphthalene 1.Methylnaphthalene Acenaphthene
## Horvath EAA -0.0076 (0.0072) -0.0105 (0.0175) 0.0223 (0.0388)
## Hannum EAA -0.004 (0.0063) -0.0191 (0.0149) 0.0363 (0.0332)
## PhenoAge EAA -0.0096 (0.0066) -0.021 (0.0159) 0.0897 * (0.0348)
## Skin&Blood EAA -0.0074 (0.0053) -0.0251* (0.0125) -0.0142 (0.0282)
## GrimAge EAA 0.0073 (0.0047) 0.0115 (0.0111) 0.0636 ** (0.0241)
## DNAmTL -0.0003 (0.0003) -0.0006 (0.0007) -0.0012 (0.0015)
## IEAA -0.0049 (0.0067) -0.0034 (0.016 ) 0.0423 (0.0355)
## EEAA -0.0024 (0.008 ) -0.017 (0.0192) 0.0427 (0.0424)
## Phenanthrene_Anthracene Fluoranthene Pyrene
## Horvath EAA 0.0031 * (0.0015) 0.0312 (0.0176) 0.0731 (0.8107)
## Hannum EAA 0 (0.0013) 0.0012 (0.0153) 0.5231 (0.7025)
## PhenoAge EAA 0.0016 (0.0014) 0.0193 (0.0163) 1.1583 (0.7147)
## Skin&Blood EAA 0.0002 (0.0011) 0.0076 (0.0129) 0.4467 (0.5627)
## GrimAge EAA 0.0029 ** (0.0009) 0.0336 ** (0.011 ) 0.9093 (0.501 )
## DNAmTL -0.0002** (0.0001) -0.002 ** (0.0007) -0.0252 (0.0306)
## IEAA 0.0026 (0.0014) 0.0252 (0.0162) 0.3662 (0.7378)
## EEAA 0.0005 (0.0017) 0.0062 (0.0195) 0.7881 (0.8911)